Abstract. XML database query languages such as XQuery employ regular expression
types with structural subtyping. Subtyping systems typically have two
presentations, which should be equivalent: a declarative version in which the 
subsumption rule may be used anywhere, and an algorithmic version in which 
the use of subsumption is limited in order to make typechecking syntax-directed 
and decidable. However, the XQuery standard type system circumvents this issue 
by using imprecise typing rules for iteration constructs and defining only algo- 
rithmic typechecking, and another extant proposal provides more precise types 
for iteration constructs but ignores subtyping. In this paper, we consider a core 
XQuery-like language with a subsumption rule and prove the completeness of 
algorithmic typechecking; this is straightforward for XQuery proper but requires 
some care in the presence of more precise iteration typing disciplines. We extend 
this result to an XML update language we have introduced in earlier work. 



1 Introduction 

The Extensible Markup Language (XML) is a World Wide Web Consortium (W3C) 
standard for tree-structured data. Regular expression types for XML [13] have been
studied extensively in XML processing languages such as XDuce [12] and CDuce [1],
as well as projects to extend general-purpose programming languages with XML
features such as Xtatic [9] and OCamlDuce [8].

Several other W3C standards, such as XQuery, address the use of XML as a general 
format for representing data in databases. Static typechecking is important in XML 
database applications because type information is useful for optimizing queries and 
avoiding expensive run-time checks and revalidation. The XQuery standard [5] provides
for structural subtyping based on regular expression types. 

However, XQuery's type system is imprecise in some situations involving iteration
(for-expressions). In particular, if the variable $x has type¹ a[b[]*, c[]?], then the
XQuery expression

for $y in $x/* return $y

has type (b[]|c[])* in XQuery, but in fact the result will always match the regular
expression type b[]*, c[]?. The reason for this inaccuracy is that XQuery's type system
typechecks a for loop by converting the type of the body of the expression (here, $x/*
with type b[]*, c[]?) to the "factored" form (α1| ... |αn)^q, where q is a quantifier such
as ?, +, or * and each αi is an atomic type (i.e. a data type such as string or single
element type a[τ]).

¹ We use the notation for regular expression types from Hosoya, Vouillon and Pierce [13] in
preference to the more verbose XQuery or XML Schema syntaxes.
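The factoring step described above can be illustrated with a small sketch. It is not the algorithm of the XQuery standard: the tuple encoding of types, the helper names (`atoms`, `bounds`, `factor`, `opt`), and the use of saturating length bounds to pick the quantifier are all assumptions made for this illustration.

```python
# Types as nested tuples (an encoding assumed for this sketch):
#   ("empty",)        the type ()
#   ("atom", name)    an atomic type, e.g. ("atom", "b[]")
#   ("seq", t1, t2)   t1, t2
#   ("alt", t1, t2)   t1 | t2
#   ("star", t)       t*

MANY = 2  # saturating upper bound meaning "two or more"


def atoms(t):
    """Set of atomic types occurring at the top level of t."""
    tag = t[0]
    if tag == "empty":
        return set()
    if tag == "atom":
        return {t[1]}
    if tag == "star":
        return atoms(t[1])
    return atoms(t[1]) | atoms(t[2])  # seq or alt


def bounds(t):
    """(lo, hi): shortest and longest sequence length, capped at 1 and MANY."""
    tag = t[0]
    if tag == "empty":
        return (0, 0)
    if tag == "atom":
        return (1, 1)
    if tag == "star":
        _, hi = bounds(t[1])
        return (0, MANY if hi > 0 else 0)
    (lo1, hi1), (lo2, hi2) = bounds(t[1]), bounds(t[2])
    if tag == "seq":
        return (min(1, lo1 + lo2), min(MANY, hi1 + hi2))
    return (min(lo1, lo2), max(hi1, hi2))  # alt


def factor(t):
    """Approximate t by (a1|...|an)^q; returns (sorted atoms, q)."""
    lo, hi = bounds(t)
    if hi <= 1:
        q = "" if lo == 1 else "?"
    else:
        q = "+" if lo == 1 else "*"
    return (sorted(atoms(t)), q)


def opt(t):
    """t? defined as t | ()."""
    return ("alt", t, ("empty",))


# b[]*, c[]?  factors to  (b[] | c[])*
body = ("seq", ("star", ("atom", "b[]")), opt(("atom", "c[]")))
print(factor(body))
```

The sketch makes the loss of precision visible: the order of the atoms and the fact that at most one c[] occurs are both discarded once the type is reduced to a set of atoms and a single quantifier.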

More precise type systems have been contemplated for XQuery-like languages,
including a precursor to XQuery designed by Fernandez, Simeon, and Wadler [6].
More recently, Colazzo et al. [4] have introduced a core XQuery language called μXQ,
equipped with a regular expression-based type system that provides more precise types
for iterations using techniques similar to those in [6]. In μXQ, the above expression can
be assigned the more accurate type b[]*, c[]?.

Accurate typing for iteration constructs is especially important in typechecking
XML updates. We are developing a statically-typed update language called Flux [3] in
which ideas from μXQ are essential for typechecking updates involving iteration. Using
XQuery-style factoring for iteration in Flux would make it impossible to typecheck
updates that modify data without modifying the overall schema of the database, a very
common case. For example, using XQuery-style factoring for iteration in Flux, we
would not be able to verify statically that given a database of type a[b[string]*, c[]?],
an update that modifies the text inside some of the b elements produces an output that
is still of type a[b[string]*, c[]?], rather than a[(b[string]|c[])*].

One question left unresolved in previous work on both μXQ and Flux is the
relationship between declarative and algorithmic presentations of the type system (in the
terminology of [14, Ch. 15-16]). Declarative derivations permit arbitrary uses of the
subsumption rule:

\[
\frac{\Gamma \vdash e : \tau \qquad \tau <: \tau'}{\Gamma \vdash e : \tau'}
\]

whereas algorithmic derivations limit the use of this rule in order to ensure that type- 
checking is syntax-directed and decidable. The declarative and algorithmic presenta- 
tions of a system should agree. If they do, then declarative typechecking is decidable; 
if they disagree, then the algorithmic system is incomplete relative to the high-level 
declarative system: it rejects programs that should typecheck. 

The XQuery standard circumvented this issue by directly defining typechecking to
be algorithmic. In contrast, neither subsumption nor subtyping was considered in μXQ,
in part because subtyping interacts badly with μXQ's "path correctness" analysis (as
argued by Colazzo et al. [4, Section 4.4]). Subsumption was considered in our initial work
on Flux [3], but we were initially unable to establish that declarative typechecking was
decidable, even in the absence of recursion in types, queries, or updates.

In this paper we consider declarative typechecking for μXQ and Flux extended
with recursive types, recursive functions, and recursive update procedures. To
establish that typechecking remains decidable, it suffices (following Pierce [14, Ch. 16])
to define an algorithmic typechecking judgment and prove its completeness; that is,
that declarative derivations can always be normalized to algorithmic derivations. For
XQuery proper, this appears straightforward because of the use of factoring when
typechecking iterations. However, for μXQ's more precise iteration type discipline,
completeness of algorithmic typechecking does not follow by the "obvious" structural
induction. Instead, we must establish a stronger property by considering the structure of
regular expression types. We also extend these results to Flux.

The structure of the rest of the paper is as follows. Section 2 reviews regular
expression types and subtyping. Section 3 introduces the core language μXQ, discusses
examples highlighting the difficulties involving subtyping in μXQ, and proves
decidability of declarative typechecking. We also review the Flux core update language in
Section 4, discuss examples, and extend the proof of decidability of declarative
typechecking to Flux. Sections 5 and 6 sketch related and future work and conclude.

2 Background 

For the purposes of this paper, XML values are trees built up out of booleans b ∈ Bool =
{true, false}, strings w ∈ Σ* over some alphabet Σ, and labels l, m, n ∈ Lab,
according to the following syntax:

\[
\bar{v} ::= b \mid w \mid n[v] \qquad\qquad v ::= \bar{v}, v \mid ()
\]

Values include tree values v̄ ∈ Tree and forest values v ∈ Val. We write v, v' for the
result of appending two forest values (considered as lists).

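We consider a regular expression type system with structural subtyping, similar to
those considered in several transformation and query languages for XML [13,4,7]. The
syntax of types and type environments is as follows.

\[
\begin{array}{lrcl}
\text{Atomic types} & \alpha & ::= & \mathtt{bool} \mid \mathtt{string} \mid n[\tau] \\
\text{Sequence types} & \tau & ::= & \alpha \mid () \mid \tau|\tau' \mid \tau,\tau' \mid \tau^* \mid X \\
\text{Type definitions} & \tau_0 & ::= & \alpha \mid () \mid \tau_0|\tau_0' \mid \tau_0,\tau_0' \mid \tau_0^* \\
\text{Type signatures} & E & ::= & \cdot \mid E, \mathtt{type}\ X = \tau_0
\end{array}
\]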

We call types of the form α ∈ Atom atomic types (or sometimes tree or singular types),
and types τ ∈ Type of all other forms sequence types (or sometimes forest or plural
types). A value of singular type must always be a sequence
of length one (that is, a tree); plural types may have values of any length. There exist
plural types with only values of length one, but which are not syntactically singular
(for example string|bool). As usual, the + and ? quantifiers can be defined as follows:
τ+ = τ, τ* and τ? = τ|(). We abbreviate n[()] as n[].

Note that in contrast to Hosoya et al. [13], but following Colazzo et al. [4], we
include both Kleene star and type variables. In [13], it was shown that Kleene star can
be translated away by introducing type variables and definitions, modulo a syntactic
restriction on top-level occurrences of type variables. In contrast, we allow Kleene star,
but further restrict type variables. Recursive and mutually recursive declarations are
allowed, but type variables may not appear at the top level of a type definition τ0: for
example, type X = nil[] | cons[α, X] and type Y = leaf[] | node[X, X] are allowed
but type X' = () | a[], X' and type Y' = b[] | Y', Y' are not. The equation for X'
defines the regular tree language a[]*, and would be permitted in XDuce, while that for
Y' defines a context-free tree language that is not regular.

An environment E is well-formed if all type variables appearing in definitions are
themselves declared in E. Given a well-formed environment E, we write E(X) for the
definition of X. A type τ denotes the set of values ⟦τ⟧_E, defined as follows.

\[
\begin{aligned}
[\![\mathtt{string}]\!]_E &= \Sigma^* \qquad [\![\mathtt{bool}]\!]_E = \mathit{Bool} \qquad [\![()]\!]_E = \{()\} \\
[\![n[\tau]]\!]_E &= \{ n[v] \mid v \in [\![\tau]\!]_E \} \qquad [\![X]\!]_E = [\![E(X)]\!]_E \qquad [\![\tau|\tau']\!]_E = [\![\tau]\!]_E \cup [\![\tau']\!]_E \\
[\![\tau,\tau']\!]_E &= \{ v, v' \mid v \in [\![\tau]\!]_E,\ v' \in [\![\tau']\!]_E \} \\
[\![\tau^*]\!]_E &= \{()\} \cup \{ v_1, \ldots, v_n \mid v_1 \in [\![\tau]\!]_E, \ldots, v_n \in [\![\tau]\!]_E \}
\end{aligned}
\]

Formally, ⟦τ⟧_E must be defined by a least fixed point construction which we take for
granted. Henceforth, we treat E as fixed and define ⟦τ⟧ = ⟦τ⟧_E.

In addition, we define a binary subtyping relation on types. A type τ1 is a subtype
of τ2 (τ1 <: τ2), by definition, if ⟦τ1⟧ ⊆ ⟦τ2⟧. Our types can be translated to XDuce
types, so subtyping reduces to XDuce subtyping; although this problem is
EXPTIME-complete in general, the algorithm of [13] is well-behaved in practice. Therefore, we
shall not give explicit inference rules for checking or deciding subtyping, but treat it as
a "black box".
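Since subtyping is defined semantically as inclusion of denotations, it can help to see the denotation itself made executable. The following is a minimal sketch of a membership test for ⟦τ⟧_E, not a subtyping algorithm; the tuple encodings of values and types and the names `ends`, `tree_ok`, and `member` are assumptions of this illustration. It relies on the restriction stated above that definitions have no top-level type variables, so unfolding a variable always terminates.

```python
# Values: a tree is a bool, a str, or (label, forest); a forest is a list of trees.
# Types:  ("bool",), ("string",), ("elem", n, t), ("empty",),
#         ("alt", t1, t2), ("seq", t1, t2), ("star", t), ("var", X)
# E maps type-variable names to their definitions.

def tree_ok(a, v, E):
    """Does the single tree v belong to the atomic type a?"""
    if a[0] == "bool":
        return isinstance(v, bool)
    if a[0] == "string":
        return isinstance(v, str)
    # element type n[t]: label must match and the children must match t
    return isinstance(v, tuple) and v[0] == a[1] and member(v[1], a[2], E)


def ends(t, forest, i, E):
    """Yield every j such that forest[i:j] belongs to the denotation of t."""
    tag = t[0]
    if tag == "empty":
        yield i
    elif tag in ("bool", "string", "elem"):
        if i < len(forest) and tree_ok(t, forest[i], E):
            yield i + 1
    elif tag == "alt":
        yield from ends(t[1], forest, i, E)
        yield from ends(t[2], forest, i, E)
    elif tag == "seq":
        for k in set(ends(t[1], forest, i, E)):
            yield from ends(t[2], forest, k, E)
    elif tag == "star":
        # reachable end positions after zero or more repetitions
        seen, todo = {i}, [i]
        while todo:
            k = todo.pop()
            for j in ends(t[1], forest, k, E):
                if j not in seen:
                    seen.add(j)
                    todo.append(j)
        yield from seen
    elif tag == "var":
        yield from ends(E[t[1]], forest, i, E)


def member(forest, t, E):
    """Is the whole forest a value of type t?"""
    return len(forest) in set(ends(t, forest, 0, E))


# type Tree = tree[leaf[string] | node[Tree*]]
E = {"Tree": ("elem", "tree",
              ("alt", ("elem", "leaf", ("string",)),
                      ("elem", "node", ("star", ("var", "Tree")))))}
v = ("tree", [("node", [("tree", [("leaf", ["hi"])])])])
print(member([v], ("var", "Tree"), E))
print(member([v, v], ("var", "Tree"), E))
```

A membership test can refute a claimed subtyping τ1 <: τ2 by exhibiting a witness value, but it cannot establish one; deciding inclusion requires an algorithm such as the one cited above.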



3 Query language 

We review an XQuery-like core language based on μXQ [4]. In μXQ, we distinguish
between tree variables x̄ ∈ TVar, introduced by for, and forest variables, x ∈ Var,
introduced by let. We write x̂ ∈ Var ∪ TVar for an arbitrary variable. The other
syntactic classes of our variant of μXQ include booleans, strings, and labels introduced
above, function names F ∈ FSym, expressions e ∈ Expr, and programs p ∈ Prog; the
abstract syntax of expressions and programs is defined as follows:

\[
\begin{array}{rcl}
e & ::= & () \mid e, e' \mid n[e] \mid w \mid x \mid \mathtt{let}\ x = e\ \mathtt{in}\ e' \mid F(e_1, \ldots, e_n) \\
  & \mid & b \mid \mathtt{if}\ c\ \mathtt{then}\ e\ \mathtt{else}\ e' \mid \bar{x} \mid \bar{x}/\mathtt{child} \mid e :: n \mid \mathtt{for}\ \bar{x} \in e\ \mathtt{return}\ e' \\
p & ::= & \mathtt{query}\ e : \tau \mid \mathtt{declare\ function}\ F(x_1{:}\tau_1, \ldots, x_n{:}\tau_n) : \tau\ \{e\};\ p
\end{array}
\]

The distinguished variables x̄ in for x̄ ∈ e return e'(x̄) and x in let x = e in e'(x)
are bound in e'. Here and elsewhere, we employ common conventions such as
considering expressions containing bound variables equivalent up to α-renaming and
employing a richer concrete syntax including parentheses.

To simplify the presentation, we split μXQ's projection operation x̄/child :: l into
two expressions: child projection (x̄/child) which returns the children of x̄, and node
name filtering (e :: n) which evaluates e to an arbitrary sequence and selects the nodes
labeled n. Thus, the ordinary child axis expression x̄/child :: n is syntactic sugar for
(x̄/child) :: n and the "wildcard" child axis is definable as x̄/child :: * = x̄/child.
Built-in operations such as string equality may be provided as additional functions F. 

Colazzo et al. [4] provided a denotational semantics of μXQ queries with the
descendant axis but without recursive functions. This semantics is sound with respect to the
typing rules in the next section and can be extended to handle recursive functions using 
operational techniques (as in the XQuery standard). However, we omit the semantics 
since it is not needed in the rest of the paper. 
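
\[
\boxed{\Gamma \vdash e : \tau}
\]
\[
\frac{\bar{x}{:}\alpha \in \Gamma}{\Gamma \vdash \bar{x} : \alpha} \qquad
\frac{x{:}\tau \in \Gamma}{\Gamma \vdash x : \tau} \qquad
\frac{w \in \Sigma^*}{\Gamma \vdash w : \mathtt{string}} \qquad
\frac{b \in \mathit{Bool}}{\Gamma \vdash b : \mathtt{bool}}
\]
\[
\frac{}{\Gamma \vdash () : ()} \qquad
\frac{\Gamma \vdash e : \tau}{\Gamma \vdash n[e] : n[\tau]} \qquad
\frac{\Gamma \vdash e : \tau \quad \Gamma \vdash e' : \tau'}{\Gamma \vdash e, e' : \tau, \tau'} \qquad
\frac{\Gamma \vdash e_1 : \tau_1 \quad \Gamma, x{:}\tau_1 \vdash e_2 : \tau_2}{\Gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : \tau_2}
\]
\[
\frac{\Gamma \vdash c : \mathtt{bool} \quad \Gamma \vdash e_1 : \tau_1 \quad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash \mathtt{if}\ c\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2 : \tau_1|\tau_2} \qquad
\frac{\bar{x}{:}n[\tau] \in \Gamma}{\Gamma \vdash \bar{x}/\mathtt{child} : \tau} \qquad
\frac{\Gamma \vdash e : \tau \quad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e :: n : \tau'}
\]
\[
\frac{\Gamma \vdash e_1 : \tau_1 \quad \Gamma \vdash \bar{x} \mathbin{\mathtt{in}} \tau_1 \to e_2 : \tau_2}{\Gamma \vdash \mathtt{for}\ \bar{x} \in e_1\ \mathtt{return}\ e_2 : \tau_2} \qquad
\frac{F(\vec{\tau}) : \tau_0 \in \Delta \quad \Gamma \vdash e_i : \tau_i}{\Gamma \vdash F(\vec{e}) : \tau_0} \qquad
\frac{\Gamma \vdash e : \tau \quad \tau <: \tau'}{\Gamma \vdash e : \tau'}
\]
\[
\boxed{\Gamma \vdash p\ \mathit{prog}}
\]
\[
\frac{\Gamma \vdash e : \tau}{\Gamma \vdash \mathtt{query}\ e : \tau\ \mathit{prog}} \qquad
\frac{F \text{ not declared in } p \quad F(\vec{\tau}) : \tau_0 \in \Delta \quad \Gamma, \vec{x}{:}\vec{\tau} \vdash e : \tau_0 \quad \Gamma \vdash p\ \mathit{prog}}{\Gamma \vdash \mathtt{declare\ function}\ F(\vec{x}{:}\vec{\tau}) : \tau_0\ \{e\};\ p\ \mathit{prog}}
\]

Fig. 1. Query and program well-formedness rules

[Figure 2: inference rules for the judgments τ :: n ⇒ τ' and Γ ⊢ x̄ in τ → e : τ'; the rules are not recoverable from the source text.]

Fig. 2. Auxiliary judgments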



3.1 Type system 

Our type system for queries is essentially that introduced for μXQ by [4], excluding the
path correctness component. We consider typing environments Γ and global declaration
environments Δ, defined as follows:

\[
\Gamma ::= \cdot \mid \Gamma, x{:}\tau \mid \Gamma, \bar{x}{:}\alpha \qquad\qquad \Delta ::= \cdot \mid \Delta, F(\vec{\tau}) : \tau_0
\]

Note that in Γ, tree variables may only be bound to atomic types. As usual, we assume
that variables in type environments are distinct; this convention implicitly constrains
all inference rules. We also write Γ <: Γ' to indicate that dom(Γ) = dom(Γ') and
Γ(x̂) <: Γ'(x̂) for all x̂ ∈ dom(Γ).

The main typing judgment for queries is Γ ⊢ e : τ; we also define a program
well-formedness judgment Γ ⊢ p prog which typechecks the bodies of functions. Following
[4], there are two auxiliary judgments, Γ ⊢ x̄ in τ → e : τ', used for typechecking
for-expressions, and τ :: n ⇒ τ', used for typechecking label matching expressions
e :: n. The rules for these judgments are shown in Figures 1 and 2.

We consider the typing rules to be implicitly parameterized by a fixed global
declaration environment Δ. Functions in XQuery have global scope so we assume that
the declarations for all the functions declared in the program have already been added
to Δ by a preprocessing pass. Additional declarations for built-in functions might be
included in Δ as well.

The rules involving type variables in Figure 2 look up the variable's definition in E.
These judgments only inspect the top-level of a type; they do not inspect the contents
of element types n[τ]. Since type definitions τ0 have no top-level type variables, both
judgments are terminating. (This was argued in detail by Colazzo et al. [4, Lem. 4.6].)

3.2 Examples 

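We first revisit the example in the introduction in order to illustrate the operation of the
rules. Recall that $x/* is translated to x̄/child in our core language. Writing Γ₀ for the
environment x̄:a[b[]*, c[]?], the derivation is

\[
\dfrac{
  \Gamma_0 \vdash \bar{x}/\mathtt{child} : b[]^*, c[]^?
  \qquad
  \begin{array}{c} \mathcal{D} \\ \Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[]^*, c[]^? \to \bar{y} : b[]^*, c[]^? \end{array}
}{
  \Gamma_0 \vdash \mathtt{for}\ \bar{y} \in \bar{x}/\mathtt{child}\ \mathtt{return}\ \bar{y} : b[]^*, c[]^?
}
\]

where the subderivation 𝒟 is

\[
\dfrac{
  \dfrac{
    \dfrac{\Gamma_0, \bar{y}{:}b[] \vdash \bar{y} : b[]}
          {\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[] \to \bar{y} : b[]}
  }{\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[]^* \to \bar{y} : b[]^*}
  \qquad
  \dfrac{
    \dfrac{\Gamma_0, \bar{y}{:}c[] \vdash \bar{y} : c[]}
          {\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} c[] \to \bar{y} : c[]}
  }{\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} c[]^? \to \bar{y} : c[]^?}
}{
  \Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[]^*, c[]^? \to \bar{y} : b[]^*, c[]^?
}
\]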

Note that this derivation does not use subsumption anywhere. Suppose we wished
to show that the expression has type b[]*, (c[]?|d[]*), a supertype of the above type.
There are several ways to do this: first, we can simply use subsumption at the end
of the derivation. Alternatively, we could have used subsumption in one of the
subderivations such as x̄:a[b[]*, c[]?], ȳ:c[] ⊢ ȳ : c[], to conclude, for example, that
x̄:a[b[]*, c[]?], ȳ:c[] ⊢ ȳ : c[]?|d[]*. This is valid since c[] <: c[]?|d[]*.

Suppose, instead, that we actually wanted to show that the above expression has
type (b[d[]*]|c[]?)*, also a supertype of the derived type. There are again several ways
of doing this. Besides using subsumption at the end of the derivation, we might have
used it on x̄:a[b[]*, c[]?] ⊢ x̄/child : b[]*, c[]? to obtain x̄:a[b[]*, c[]?] ⊢ x̄/child :
(b[d[]*]|c[]?)*. To complete the derivation, we would then need to replace derivation 𝒟
with 𝒟' (writing Γ₀ for x̄:a[b[]*, c[]?]):

\[
\dfrac{
  \dfrac{
    \dfrac{\Gamma_0, \bar{y}{:}b[d[]^*] \vdash \bar{y} : b[d[]^*]}
          {\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[d[]^*] \to \bar{y} : b[d[]^*]}
    \qquad
    \dfrac{
      \dfrac{\Gamma_0, \bar{y}{:}c[] \vdash \bar{y} : c[]}
            {\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} c[] \to \bar{y} : c[]}
    }{\Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} c[]^? \to \bar{y} : c[]^?}
  }{
    \Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} b[d[]^*]|c[]^? \to \bar{y} : b[d[]^*]|c[]^?
  }
}{
  \Gamma_0 \vdash \bar{y} \mathbin{\mathtt{in}} (b[d[]^*]|c[]^?)^* \to \bar{y} : (b[d[]^*]|c[]^?)^*
}
\]

Not only does 𝒟' have different structure than 𝒟, but it also requires subderivations that
were not syntactically present in 𝒟.

The above example illustrates why eliminating uses of subsumption is tricky. If
subsumption is used to weaken the type of the first argument of a for-expression according
to τ1' <: τ1, then we need to know that we can transform the corresponding derivation
𝒟 of Γ ⊢ x̄ in τ1 → e : τ2 to a derivation 𝒟' of Γ ⊢ x̄ in τ1' → e : τ2' for some
τ2' <: τ2. But as illustrated above, the derivations 𝒟 and 𝒟' may bear little resemblance
to one another.

Now we consider typechecking a recursive query. Suppose we have type Tree =
tree[leaf[string] | node[Tree*]] and function definition

declare function leaves(x : Tree) : leaf[string]* {
  x/leaf, for z in x/node/* return leaves(z)
};

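This uses a construct e/n that is not in core μXQ, but we can expand e/n to for ȳ ∈
e return ȳ/child :: n; thus, we can derive a rule

\[
\frac{\Gamma \vdash e : l[\tau] \qquad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e/n : \tau'}
\]

from the derivation

\[
\dfrac{
  \Gamma \vdash e : l[\tau]
  \qquad
  \dfrac{
    \dfrac{\Gamma, \bar{y}{:}l[\tau] \vdash \bar{y}/\mathtt{child} : \tau \qquad \tau :: n \Rightarrow \tau'}
          {\Gamma, \bar{y}{:}l[\tau] \vdash \bar{y}/\mathtt{child} :: n : \tau'}
  }{\Gamma \vdash \bar{y} \mathbin{\mathtt{in}} l[\tau] \to \bar{y}/\mathtt{child} :: n : \tau'}
}{
  \Gamma \vdash \mathtt{for}\ \bar{y} \in e\ \mathtt{return}\ \bar{y}/\mathtt{child} :: n : \tau'
}
\]

Using this derived rule and the fact that x : Tree and the definition of Tree, we
can see that x/leaf : leaf[string] and x/node : node[Tree*], and so x/node/* :
tree[leaf[string]|node[Tree*]]*. So each iteration of the for-loop can be typechecked
with z : tree[leaf[string]|node[Tree*]]. To check the function call leaves(z), we need
subsumption to see that tree[leaf[string]|node[Tree*]] <: Tree. It follows that
leaves(z) : leaf[string]*, so the for-loop has type (leaf[string]*)*. Again using
subsumption, we can conclude that

x/leaf, leaves(x/node/*) : leaf[string], (leaf[string]*)* <: leaf[string]*.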

Notice that although we could have used subsumption in several more places, we really 
needed it in only two places: when typechecking a function call, and when checking the 
result of a function against its declared type. 

3.3 Decidability

The standard approach (see e.g. Pierce [14, Ch. 16]) to deciding declarative
typechecking is to define algorithmic judgments that are syntax-directed and decidable, and then
show that the algorithmic system is complete relative to the declarative system.

Definition 1 (Algorithmic derivations). The algorithmic typechecking judgments Γ ⊢▸
e : τ and Γ ⊢▸ x̄ in τ0 → e : τ are defined by taking the rules of Figures 1 and 2,
removing the subsumption rule, and replacing the function application rule with

\[
\frac{F(\vec{\tau}) : \tau \in \Delta \qquad \Gamma \mathrel{\vdash\!\blacktriangleright} e_i : \tau_i' \qquad \tau_i' <: \tau_i}{\Gamma \mathrel{\vdash\!\blacktriangleright} F(\vec{e}) : \tau}
\]

It is straightforward to show that algorithmic derivability is decidable and sound
with respect to the declarative system:

Lemma 1 (Decidability). For any x̄, e, n, there exist computable partial functions f_n,
g_e, h_{x̄,e} such that for any Γ, τ0, we have:

1. f_n(τ0) is the unique τ such that τ0 :: n ⇒ τ.

2. g_e(Γ) is the unique τ such that Γ ⊢▸ e : τ, when it exists.

3. h_{x̄,e}(Γ, τ0) is the unique τ such that Γ ⊢▸ x̄ in τ0 → e : τ, when it exists.

Theorem 1 (Algorithmic Soundness). (1) If Γ ⊢▸ e : τ is derivable then Γ ⊢ e : τ
is derivable. (2) If Γ ⊢▸ x̄ in τ0 → e : τ is derivable then Γ ⊢ x̄ in τ0 → e : τ is
derivable.

The corresponding completeness property (the main result of this section) is: 

Theorem 2 (Algorithmic Completeness). (1) If Γ ⊢ e : τ then there exists τ' <: τ
such that Γ ⊢▸ e : τ'. (2) If Γ ⊢ x̄ in τ1 → e : τ2 then there exists τ2' <: τ2 such that
Γ ⊢▸ x̄ in τ1 → e : τ2'.

Given a decidable subtyping relation <:, a typical proof of completeness involves
showing by induction that occurrences of the subsumption rule can be "permuted"
downwards in the proof past other rules, except for function applications. Completeness for
μXQ requires strengthening this induction hypothesis. To see why, recall the following
rules, in which one premise of each is labeled *:

\[
\frac{\overset{*}{\Gamma \vdash e_1 : \tau_1} \quad \Gamma, x{:}\tau_1 \vdash e_2 : \tau_2}{\Gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : \tau_2} \qquad
\frac{\overset{*}{\Gamma \vdash e_1 : \tau_1} \quad \Gamma \vdash \bar{x} \mathbin{\mathtt{in}} \tau_1 \to e_2 : \tau_2}{\Gamma \vdash \mathtt{for}\ \bar{x} \in e_1\ \mathtt{return}\ e_2 : \tau_2} \qquad
\frac{\overset{*}{\Gamma \vdash e : \tau} \quad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e :: n : \tau'}
\]

If the subderivation labeled * in the above rules follows by subsumption, however, we
cannot do anything to get rid of the subsumption rule using the induction hypotheses
provided by Theorem 2. Instead we need an additional lemma that ensures that the
judgments are all downward monotonic. Downward monotonicity means, informally, that if
we make the "input" types in a derivable judgment smaller, then the judgment remains
derivable with a smaller "output" type.

Lemma 2 (Downward monotonicity).

1. If τ1 :: n ⇒ τ2 and τ1' <: τ1 then τ1' :: n ⇒ τ2' for some τ2' <: τ2.

2. If Γ ⊢▸ e : τ and Γ' <: Γ then Γ' ⊢▸ e : τ' for some τ' <: τ.

3. If Γ ⊢▸ x̄ in τ1 → e : τ2 and Γ' <: Γ and τ1' <: τ1 then Γ' ⊢▸ x̄ in τ1' → e : τ2'
for some τ2' <: τ2.

The downward monotonicity lemma is almost easy to prove by direct structural
induction (simultaneously on all judgments). The cases for (2) involving
expression-directed typechecking are all straightforward inductive steps; however, for the cases
involving type-directed judgments, the induction steps do not go through. The difficulty
is illustrated by the following cases. For derivations of the form

\[
\frac{\tau_1 :: n \Rightarrow \tau_2}{\tau_1^* :: n \Rightarrow \tau_2^*} \qquad\qquad
\frac{\Gamma \mathrel{\vdash\!\blacktriangleright} \bar{x} \mathbin{\mathtt{in}} \tau_1 \to e : \tau_2}{\Gamma \mathrel{\vdash\!\blacktriangleright} \bar{x} \mathbin{\mathtt{in}} \tau_1^* \to e : \tau_2^*}
\]

we are stuck: knowing that τ1' <: τ1* does not necessarily tell us anything about a
subtyping relationship between τ1' and τ1. For example, if τ1' = aa and τ1 = a, then we
have aa <: a* but not aa <: a. Instead, we need to proceed by an analysis of regular
expression types and subtyping.
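The counterexample can be checked mechanically by treating the atomic type a as a letter and the types as ordinary regular expressions. This is only an illustration using Python's standard `re` module; the helper names `words` and `included_up_to` are made up for the sketch, and a bounded check of this kind can refute an inclusion but does not prove one in general.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """All words over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def included_up_to(sub, sup, alphabet, max_len):
    """True if every word of length <= max_len matching sub also matches sup."""
    return all(re.fullmatch(sup, w) is not None
               for w in words(alphabet, max_len)
               if re.fullmatch(sub, w) is not None)


print(included_up_to("aa", "a*", "a", 4))  # aa <: a*  holds on all tested words
print(included_up_to("aa", "a", "a", 4))   # aa <: a   is refuted by the word "aa"
```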

We briefly sketch the argument, which involves an excursion into the theory of
regular languages over partially ordered alphabets. Here, the "alphabet" is the set of
atomic types and the regular sets are the sets of sequences of atomic types that are
subtypes of a type τ. The homomorphic extension ĥ of a (possibly partial) function h :
Atom ⇀ Type on atomic types is defined as

\[
\begin{aligned}
\hat{h}(()) &= () & \hat{h}(\alpha) &= h(\alpha) & \hat{h}(\tau^*) &= \hat{h}(\tau)^* \\
\hat{h}(\tau_1, \tau_2) &= \hat{h}(\tau_1), \hat{h}(\tau_2) & \hat{h}(\tau_1|\tau_2) &= \hat{h}(\tau_1)|\hat{h}(\tau_2) & \hat{h}(X) &= \hat{h}(E(X))
\end{aligned}
\]

(Note again that this definition is well-founded, since type variables cannot be expanded
indefinitely.) If h is partial, then ĥ is defined only on types whose atoms are in dom(h).
We can then show the following general property of partial homomorphic extensions:

Lemma 3. If h : Atom ⇀ Type is downward monotonic, then its homomorphic
extension ĥ : Type ⇀ Type is downward monotonic.

It then suffices to show that f_n and h_{x̄,e} are partial homomorphic extensions of
downward monotone functions on atomic types; for f_n, the required function is
simple and obviously monotone, and for h_{x̄,e}(Γ, −), the required generating function is
g_e(Γ, x̄:(−)). Thus, we need to show that g_e and h_{x̄,e} are downward monotonic and
that h_{x̄,e}(Γ, −) is the partial homomorphic extension of g_e(Γ, x̄:(−)) simultaneously
by mutual induction. This, finally, is a straightforward induction over derivations. More
detailed proofs are included in the appendix.
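The homomorphic extension is short enough to write down directly. The sketch below follows the defining equations above; the tuple encoding of types and the names `hom_ext` and `filter_atom` are assumptions of the illustration. In particular, `filter_atom` is only a plausible reading of the generating function for f_n (keep atoms labeled n, map every other atom to ()), since the text describes that function as "simple" without spelling it out.

```python
# Types: ("bool",), ("string",), ("elem", n, t), ("empty",),
#        ("alt", t1, t2), ("seq", t1, t2), ("star", t), ("var", X)
# E maps type-variable names to definitions without top-level variables.

def hom_ext(h, t, E):
    """Homomorphic extension of h (a function on atomic types) to all types."""
    tag = t[0]
    if tag == "empty":
        return t
    if tag in ("bool", "string", "elem"):
        return h(t)                      # atoms are mapped by h itself
    if tag == "star":
        return ("star", hom_ext(h, t[1], E))
    if tag in ("seq", "alt"):
        return (tag, hom_ext(h, t[1], E), hom_ext(h, t[2], E))
    if tag == "var":
        return hom_ext(h, E[t[1]], E)    # unfold the definition once
    raise ValueError(f"unknown type constructor: {tag}")


def filter_atom(n):
    """Assumed generating function for f_n: keep n-labeled elements only."""
    def h(a):
        return a if a[0] == "elem" and a[1] == n else ("empty",)
    return h


B = ("elem", "b", ("empty",))
C = ("elem", "c", ("empty",))
T = ("seq", ("star", B), ("alt", C, ("empty",)))   # b[]*, c[]?

# (b[]*, c[]?) :: b, left unsimplified
print(hom_ext(filter_atom("b"), T, {}))
# For the loop body y, the generating function is the identity on atoms,
# so iterating over T yields T again, as in the example of Section 3.2.
print(hom_ext(lambda a: a, T, {}) == T)
```

Because the extension only rebuilds the regular-expression structure around the images of the atoms, monotonicity of the generating function is the only place where subtyping of atoms enters, which is the content of Lemma 3.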

4 Update language 

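We now introduce the core Flux update language, which extends the syntax of queries
with statements s ∈ Stmt, procedure names P ∈ PSym, tests φ ∈ Test, directions
d ∈ Dir, and two new cases for programs:

\[
\begin{array}{rcl}
s & ::= & \mathtt{skip} \mid s; s' \mid \mathtt{if}\ e\ \mathtt{then}\ s\ \mathtt{else}\ s' \mid \mathtt{let}\ x = e\ \mathtt{in}\ s \mid P(\vec{e}) \\
  & \mid & \mathtt{insert}\ e \mid \mathtt{delete} \mid \mathtt{rename}\ n \mid \mathtt{snapshot}\ x\ \mathtt{in}\ s \mid \phi?s \mid d[s] \\
\phi & ::= & n \mid * \mid \mathtt{bool} \mid \mathtt{string} \qquad\qquad d ::= \mathtt{left} \mid \mathtt{right} \mid \mathtt{children} \mid \mathtt{iter} \\
p & ::= & \cdots \mid \mathtt{update}\ s : \tau \Rightarrow \tau' \mid \mathtt{declare\ procedure}\ P(\vec{x}{:}\vec{\tau}) : \tau \Rightarrow \tau'\ \{s\};\ p
\end{array}
\]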

Updates include standard programming constructs such as the no-op skip, sequential 
composition, conditionals, and let-binding. The basic update operations include in- 
sertion insert e, which inserts a value into an empty part of the database; deletion 
delete, which deletes part of the database; and rename n, which renames a part of the 
database provided it is a single tree. The "snapshot" operation snapshot x in s binds
x to part of the database and then applies an update s, which may refer to x. Note that
the snapshot operation is the only way to read from the current database state. 

Updates also include tests φ?s which test the top-level type of a singular value and
conditionally perform an update, otherwise do nothing. The node label test n?s checks
whether the tree is of type n[τ], and if so executes s; the wildcard test *?s checks that
the value is a tree. Similarly, bool?s and string?s test whether a value is a boolean or
string. The ? operator binds tightly; for example, φ?s; s' = (φ?s); s'.

Finally, updates include navigation operators that change the selected part of the 
tree, and perform an update on the sub-selection. The left and right operators per- 
form an update (typically, an insert) on the empty sequence located to the left or right 
of a value. The children operator applies an update to the child list of a tree value. 
The iter operator applies an update to each tree value in a forest. 

We distinguish between singular (unary) updates which apply only when the
context is a tree value and plural (multi-ary) updates which apply to a sequence. Tests φ?s
are always singular. The children operator applies a plural update to all of the
children of a single node; the iter operator applies a singular update to all of the elements
of a sequence. Other updates can be either singular or plural in different situations. Our
type system tracks multiplicity as well as input and output types in order to ensure that 
updates are well-behaved. 

Flux updates operate on a part of the database that is "in focus", which helps en- 
sure that updates are deterministic and relatively easy to typecheck. Only the navigation 
operations left, right, children, iter can change the focus. We lack space to
formalize the semantics of updates in the main body of the paper; the semantics of updates
is essentially the same as in [3] except for the addition of procedures.

4.1 Type system 

In typechecking updates, we extend the global declaration context Δ with procedure
declarations:

\[
\Delta ::= \cdots \mid \Delta, P(\vec{\tau}) : \tau_1 \Rightarrow \tau_2
\]

There are two typing judgments for updates: singular well-formedness Γ ⊢¹ {α} s {τ'}
(that is, in type environment Γ, update s maps tree type α to type τ'), and plural
well-formedness Γ ⊢* {τ} s {τ'} (that is, in type environment Γ, update s maps type τ to
type τ'). Several of the rules are parameterized by a multiplicity a ∈ {1, *}. In addition,
there is an auxiliary judgment Γ ⊢iter {τ} s {τ'} for typechecking iterations. The rules
for update well-formedness are shown in Figure 3. We also need an auxiliary subtyping
relation involving atomic types and tests: we say that α <: φ if ⟦α⟧ ⊆ ⟦φ⟧. This is
characterized by the rules:

\[
\mathtt{bool} <: \mathtt{bool} \qquad \mathtt{string} <: \mathtt{string} \qquad n[\tau] <: n \qquad n[\tau] <: *
\]

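\[
\boxed{\Gamma \vdash^{a} \{\tau\}\ s\ \{\tau'\}}
\]
\[
\frac{}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{skip}\ \{\tau\}} \qquad
\frac{\Gamma \vdash^{a} \{\tau\}\ s\ \{\tau'\} \quad \Gamma \vdash^{a} \{\tau'\}\ s'\ \{\tau''\}}{\Gamma \vdash^{a} \{\tau\}\ s; s'\ \{\tau''\}} \qquad
\frac{\Gamma \vdash e : \tau \quad \Gamma, x{:}\tau \vdash^{a} \{\tau_1\}\ s\ \{\tau_2\}}{\Gamma \vdash^{a} \{\tau_1\}\ \mathtt{let}\ x = e\ \mathtt{in}\ s\ \{\tau_2\}}
\]
\[
\frac{\Gamma \vdash e : \mathtt{bool} \quad \Gamma \vdash^{a} \{\tau\}\ s\ \{\tau_1\} \quad \Gamma \vdash^{a} \{\tau\}\ s'\ \{\tau_2\}}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{if}\ e\ \mathtt{then}\ s\ \mathtt{else}\ s'\ \{\tau_1|\tau_2\}} \qquad
\frac{\Gamma, x{:}\tau \vdash^{a} \{\tau\}\ s\ \{\tau'\}}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{snapshot}\ x\ \mathtt{in}\ s\ \{\tau'\}}
\]
\[
\frac{\Gamma \vdash e : \tau}{\Gamma \vdash^{a} \{()\}\ \mathtt{insert}\ e\ \{\tau\}} \qquad
\frac{}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{delete}\ \{()\}} \qquad
\frac{}{\Gamma \vdash^{a} \{n'[\tau]\}\ \mathtt{rename}\ n\ \{n[\tau]\}}
\]
\[
\frac{\alpha <: \phi \quad \Gamma \vdash^{1} \{\alpha\}\ s\ \{\tau\}}{\Gamma \vdash^{1} \{\alpha\}\ \phi?s\ \{\tau\}} \qquad
\frac{\alpha \not<: \phi}{\Gamma \vdash^{1} \{\alpha\}\ \phi?s\ \{\alpha\}} \qquad
\frac{\Gamma \vdash^{*} \{\tau\}\ s\ \{\tau'\}}{\Gamma \vdash^{1} \{n[\tau]\}\ \mathtt{children}[s]\ \{n[\tau']\}}
\]
\[
\frac{\Gamma \vdash^{*} \{()\}\ s\ \{\tau'\}}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{left}[s]\ \{\tau', \tau\}} \qquad
\frac{\Gamma \vdash^{*} \{()\}\ s\ \{\tau'\}}{\Gamma \vdash^{a} \{\tau\}\ \mathtt{right}[s]\ \{\tau, \tau'\}} \qquad
\frac{\Gamma \vdash_{\mathrm{iter}} \{\tau\}\ s\ \{\tau'\}}{\Gamma \vdash^{*} \{\tau\}\ \mathtt{iter}[s]\ \{\tau'\}}
\]
\[
\frac{\Gamma \vdash^{a} \{\tau_1\}\ s\ \{\tau_2\} \quad \tau_2 <: \tau_2'}{\Gamma \vdash^{a} \{\tau_1\}\ s\ \{\tau_2'\}} \qquad
\frac{P(\vec{\sigma}) : \sigma_1 \Rightarrow \sigma_2 \in \Delta \quad \sigma_1' <: \sigma_1 \quad \Gamma \vdash \vec{e} : \vec{\sigma}}{\Gamma \vdash^{a} \{\sigma_1'\}\ P(\vec{e})\ \{\sigma_2\}}
\]
\[
\boxed{\Gamma \vdash_{\mathrm{iter}} \{\tau\}\ s\ \{\tau'\}}
\]
\[
\frac{}{\Gamma \vdash_{\mathrm{iter}} \{()\}\ s\ \{()\}} \qquad
\frac{\Gamma \vdash^{1} \{\alpha\}\ s\ \{\tau\}}{\Gamma \vdash_{\mathrm{iter}} \{\alpha\}\ s\ \{\tau\}} \qquad
\frac{\Gamma \vdash_{\mathrm{iter}} \{E(X)\}\ s\ \{\tau\}}{\Gamma \vdash_{\mathrm{iter}} \{X\}\ s\ \{\tau\}} \qquad
\frac{\Gamma \vdash_{\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}}{\Gamma \vdash_{\mathrm{iter}} \{\tau_1^*\}\ s\ \{\tau_2^*\}}
\]
\[
\frac{\Gamma \vdash_{\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_1'\} \quad \Gamma \vdash_{\mathrm{iter}} \{\tau_2\}\ s\ \{\tau_2'\}}{\Gamma \vdash_{\mathrm{iter}} \{\tau_1, \tau_2\}\ s\ \{\tau_1', \tau_2'\}} \qquad
\frac{\Gamma \vdash_{\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_1'\} \quad \Gamma \vdash_{\mathrm{iter}} \{\tau_2\}\ s\ \{\tau_2'\}}{\Gamma \vdash_{\mathrm{iter}} \{\tau_1|\tau_2\}\ s\ \{\tau_1'|\tau_2'\}}
\]
\[
\boxed{\Gamma \vdash p\ \mathit{prog}}
\]
\[
\frac{\Gamma \vdash^{*} \{\tau_1\}\ s\ \{\tau_2\}}{\Gamma \vdash \mathtt{update}\ s : \tau_1 \Rightarrow \tau_2\ \mathit{prog}} \qquad
\frac{P \text{ not declared in } p \quad P(\vec{\tau}) : \sigma_1 \Rightarrow \sigma_2 \in \Delta \quad \Gamma, \vec{x}{:}\vec{\tau} \vdash^{*} \{\sigma_1\}\ s\ \{\sigma_2\} \quad \Gamma \vdash p\ \mathit{prog}}{\Gamma \vdash \mathtt{declare\ procedure}\ P(\vec{x}{:}\vec{\tau}) : \sigma_1 \Rightarrow \sigma_2\ \{s\};\ p\ \mathit{prog}}
\]

Fig. 3. Update and additional program well-formedness rules

Remark 1. In most other XML update proposals (including XQuery! [11] and the draft
XQuery Update Facility [2]), side-effecting update operations are treated as expressions
that return (). Thus, we could perhaps typecheck such updates as expressions of type
(). This would work fine as long as the types of values reachable from the free
variables in Γ can never change; however, the updates available in these languages can and
do change the values of variables. Thus, to make this approach sound Γ would need to be
updated to take these changes into account, perhaps using a judgment Γ ⊢ s : () | Γ',
where Γ' is the updated type environment reflecting the types of the variables after
update s. This approach quickly becomes difficult to manage, especially if it is possible
for different variables to "alias", or refer to overlapping parts of the data accessible from
Γ, and adding side-effecting functions further complicates matters.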

This is not the approach to update typechecking that is taken in Flux. Updates are 
syntactically distinct from queries, and a Flux update typechecking judgment such as 
Γ ⊢^a {τ} s {τ'} assigns an update much richer type information that describes the
type of part of the database before and after running s. The values of variables bound
in Γ are immutable in the variable's scope, so their types do not need to be updated.
Similarly, procedures must be annotated with expected input and output types. We do 
not believe that these annotations are burdensome in a database setting since a typical 
update procedure would be expected to preserve the (usually fixed) type of the database. 
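
\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{
          \dfrac{\vdash c[] : c[]}
                {\vdash^{*} \{()\}\ \mathtt{insert}\ c[]\ \{c[]\}}
        }{\vdash^{1} \{b[]\}\ \mathtt{right\ insert}\ c[]\ \{b[], c[]\}}
      }{\vdash^{1} \{b[]\}\ b?s'\ \{b[], c[]\}}
    }{\vdash_{\mathrm{iter}} \{b[]\}\ b?s'\ \{b[], c[]\}}
  }{\vdash_{\mathrm{iter}} \{b[]^*\}\ b?s'\ \{(b[], c[])^*\}}
  \qquad
  \dfrac{\vdash^{1} \{c[]\}\ b?s'\ \{c[]\}}
        {\vdash_{\mathrm{iter}} \{c[]\}\ b?s'\ \{c[]\}}
}{
  \dfrac{\vdash_{\mathrm{iter}} \{b[]^*, c[]\}\ b?s'\ \{(b[], c[])^*, c[]\}}
        {\vdash^{*} \{b[]^*, c[]\}\ \mathtt{iter}[b?s']\ \{(b[], c[])^*, c[]\}}
}
\]
\[
\dfrac{
  \dfrac{
    \dfrac{\vdash^{*} \{b[]^*, c[]\}\ s\ \{(b[], c[])^*, c[]\}}
          {\vdash^{1} \{a[b[]^*, c[]]\}\ \mathtt{children}[s]\ \{a[(b[], c[])^*, c[]]\}}
  }{\vdash_{\mathrm{iter}} \{a[b[]^*, c[]]\}\ a?\mathtt{children}[s]\ \{a[(b[], c[])^*, c[]]\}}
  \qquad
  \vdash_{\mathrm{iter}} \{d[]\}\ a?\mathtt{children}[s]\ \{d[]\}
}{
  \dfrac{\vdash_{\mathrm{iter}} \{a[b[]^*, c[]], d[]\}\ a?\mathtt{children}[s]\ \{a[(b[], c[])^*, c[]], d[]\}}
        {\vdash^{*} \{a[b[]^*, c[]], d[]\}\ \mathtt{iter}[a?\mathtt{children}[s]]\ \{a[(b[], c[])^*, c[]], d[]\}}
}
\]

Fig. 4. Example update derivation, where s' = right insert c[] and s = iter[b?s']

\[
\begin{array}{c}
\mathit{leafupd}(\mathtt{string}) : \mathit{Tree} \Rightarrow \mathit{Tree} \in \Delta \qquad \mathit{tree}[\ldots] <: \mathit{Tree} \qquad x{:}\mathtt{string} \vdash x : \mathtt{string} \\ \hline
x{:}\mathtt{string} \vdash^{1} \{\mathit{tree}[\mathit{leaf}[\mathtt{string}]|\mathit{node}[\mathit{Tree}^*]]\}\ \mathit{leafupd}(x)\ \{\mathit{Tree}\} \\ \hline
x{:}\mathtt{string} \vdash_{\mathrm{iter}} \{\mathit{tree}[\mathit{leaf}[\mathtt{string}]|\mathit{node}[\mathit{Tree}^*]]\}\ \mathit{leafupd}(x)\ \{\mathit{Tree}\} \\ \hline
x{:}\mathtt{string} \vdash_{\mathrm{iter}} \{\mathit{Tree}\}\ \mathit{leafupd}(x)\ \{\mathit{Tree}\} \\ \hline
x{:}\mathtt{string} \vdash_{\mathrm{iter}} \{\mathit{Tree}^*\}\ \mathit{leafupd}(x)\ \{\mathit{Tree}^*\} \\ \hline
x{:}\mathtt{string} \vdash^{*} \{\mathit{Tree}^*\}\ \mathtt{iter}[\mathit{leafupd}(x)]\ \{\mathit{Tree}^*\} \\ \hline
x{:}\mathtt{string} \vdash^{1} \{\mathit{node}[\mathit{Tree}^*]\}\ \mathtt{children}[\mathtt{iter}[\mathit{leafupd}(x)]]\ \{\mathit{node}[\mathit{Tree}^*]\} \\ \hline
x{:}\mathtt{string} \vdash^{1} \{\mathit{node}[\mathit{Tree}^*]\}\ \mathit{node}?\mathtt{children}[\mathtt{iter}[\mathit{leafupd}(x)]]\ \{\mathit{node}[\mathit{Tree}^*]\}
\end{array}
\]

Fig. 5. Partial derivation for declaration of leafupd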



4.2 Examples 

The interesting rules are those involving iter, tests, and children, left/right, and 
insert/rename/delete. The following example should help illustrate how the rules 
work for these constructs. Consider the high-level update: 

insert after a/b value c[] 

which can be compiled to the following core Flux statement: 

iter[a?children[iter[b? right insert c[]]]]

Intuitively, this update inserts a c after every b under a top-level a. Now consider the
input type a[b[]*, c[]], d[]. Clearly, the output type should be a[(b[], c[])*, c[]], d[]. To see
how Flux can assign this type to the update, consider the derivation shown in Figure 4.
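To complement the typing derivation, the intended run-time behaviour of this update can be sketched with a tiny interpreter. The paper does not formalize the semantics in its main body, so everything here is an assumption based on the informal description of the operators in Section 4: the tuple encodings, the function names `run` and `passes`, and the choice to model an update as a function from the focused forest to a new forest.

```python
# A tree is (label, children) with children a list of trees, or a str/bool leaf.
# The focus of an update is a forest (list of trees).
# Updates: ("skip",), ("seq", s1, s2), ("insert", forest), ("delete",),
#          ("rename", n), ("test", phi, s), ("children", s),
#          ("left", s), ("right", s), ("iter", s)
# Tests:   ("label", n), ("any",), ("bool",), ("string",)

def passes(phi, tree):
    """Does a single tree satisfy the test phi?"""
    if phi[0] == "any":
        return True
    if phi[0] == "bool":
        return isinstance(tree, bool)
    if phi[0] == "string":
        return isinstance(tree, str)
    return isinstance(tree, tuple) and tree[0] == phi[1]


def run(s, focus):
    """Apply update s to the focused forest and return the new forest."""
    op = s[0]
    if op == "skip":
        return focus
    if op == "seq":
        return run(s[2], run(s[1], focus))
    if op == "insert":
        assert focus == [], "insert expects an empty focus"
        return list(s[1])
    if op == "delete":
        return []
    if op == "rename":
        [(_, kids)] = focus              # focus must be a single element
        return [(s[1], kids)]
    if op == "test":
        [tree] = focus                   # tests are singular
        return run(s[2], focus) if passes(s[1], tree) else focus
    if op == "children":
        [(label, kids)] = focus
        return [(label, run(s[1], kids))]
    if op == "left":
        return run(s[1], []) + focus     # update the empty gap on the left
    if op == "right":
        return focus + run(s[1], [])     # update the empty gap on the right
    if op == "iter":
        out = []
        for tree in focus:
            out += run(s[1], [tree])
        return out
    raise ValueError(f"unknown update: {op}")


b, c, d = ("b", []), ("c", []), ("d", [])
s_prime = ("right", ("insert", [c]))
s = ("iter", ("test", ("label", "b"), s_prime))
update = ("iter", ("test", ("label", "a"), ("children", s)))

db = [("a", [b, b, c]), d]               # a value of type a[b[]*, c[]], d[]
print(run(update, db))
```

On this input the result is a[b[], c[], b[], c[], c[]], d[], which is a value of the output type a[(b[], c[])*, c[]], d[] derived in Figure 4.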
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As a second example, consider the procedure declaration 

declare procedure leafupd(x:string) : Tree => Tree {
  iter[children[iter[leaf?children[delete; insert x];
                     node?children[iter[leafupd(x)]]]]]
};

This procedure updates all leaves of a tree to x. As with the recursive query discussed in
Section 3.2, this procedure requires subtyping to typecheck the recursive call. We also
need subtyping to check that the return type of the expression matches the declaration.
A partial typing derivation for part of the body of the procedure involving a recursive
call is shown in Figure 5.
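The behaviour of the first example can be simulated with a small interpreter. The following Python sketch is only an illustration and not the Flux implementation: the tuple encoding of trees and the helper names `insert`, `right`, `children`, `when_label` and `iter_` are assumptions chosen here to mirror the core statements informally.

```python
from typing import Callable, Tuple

# A tree is (label, children); a value is a tuple of trees (a forest).
Tree = Tuple[str, tuple]
Forest = Tuple[Tree, ...]
Update = Callable[[Forest], Forest]

def insert(value: Forest) -> Update:
    """insert e: replace the empty sequence by the given value."""
    def run(v: Forest) -> Forest:
        assert v == (), "insert is only applied to the empty sequence"
        return value
    return run

def right(s: Update) -> Update:
    """right s: run s on the empty sequence to the right of the focus."""
    return lambda v: v + s(())

def children(s: Update) -> Update:
    """children[s]: run s on the child sequence of a single tree."""
    def run(v: Forest) -> Forest:
        assert len(v) == 1, "children expects exactly one tree"
        label, kids = v[0]
        return ((label, s(kids)),)
    return run

def when_label(label: str, s: Update) -> Update:
    """label?s: run s if the single tree has this label, else leave it alone."""
    def run(v: Forest) -> Forest:
        assert len(v) == 1, "a test expects exactly one tree"
        return s(v) if v[0][0] == label else v
    return run

def iter_(s: Update) -> Update:
    """iter[s]: run s on each tree in turn and concatenate the results."""
    def run(v: Forest) -> Forest:
        out: Forest = ()
        for tree in v:
            out += s((tree,))
        return out
    return run

b, c, d = ("b", ()), ("c", ()), ("d", ())
inner = when_label("b", right(insert((c,))))             # b?right insert c[]
update = iter_(when_label("a", children(iter_(inner))))  # iter[a?children[iter[...]]]

doc = (("a", (b, b, c)), d)   # a value of type a[b[]*, c[]], d[]
result = update(doc)
print(result)
```

On this input the result is the forest `a[b, c, b, c, c], d`, which inhabits the output type $a[(b[], c[])^*, c[]], d[]$ derived in Figure 4.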

4.3 Decidability 

To decide typechecking, we must again carefully control the use of subsumption. The 
appropriate algorithmic typechecking judgment is defined as follows: 

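Definition 2 (Algorithmic derivations for updates). The algorithmic typechecking
judgments $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau\}\ s\ \{\tau'\}$ and $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau\}\ s\ \{\tau'\}$ are obtained by taking the rules in
Figure 3, removing both subsumption rules, and replacing the procedure call rule with

\[
\frac{P(\vec{\sigma}) : \sigma \Rightarrow \sigma' \in \Delta \qquad \tau <: \sigma \qquad \Gamma \vdash_{\blacktriangleright} \vec{e} : \vec{\tau} \qquad \vec{\tau} <: \vec{\sigma}}
     {\Gamma \vdash^{a}_{\blacktriangleright} \{\tau\}\ P(\vec{e})\ \{\sigma'\}}
\]

Moreover, all subderivations of expression judgments in an algorithmic derivation of
an update judgment must be algorithmic.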

The proof of completeness of algorithmic update typechecking has the same struc- 
ture as that for queries. We state the main results; proof details are in the appendix. 

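Lemma 4 (Decidability for updates). Let $a$, $s$ be given. Then there exist computable
functions $j_{a,s}$ and $k_s$ such that:

1. $j_{a,s}(\Gamma, \tau_1)$ is the unique $\tau_2$ such that $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2\}$, if it exists.
2. $k_s(\Gamma, \tau_1)$ is the unique $\tau_2$ such that $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$, if it exists.

Theorem 3 (Algorithmic soundness for updates). (1) If $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau\}\ s\ \{\tau'\}$ is derivable
then $\Gamma \vdash^{a} \{\tau\}\ s\ \{\tau'\}$ is derivable. (2) If $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau\}\ s\ \{\tau'\}$ is derivable then
$\Gamma \vdash_{\mathrm{iter}} \{\tau\}\ s\ \{\tau'\}$ is derivable.

Lemma 5 (Downward monotonicity for updates). (1) If $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2\}$ and $\Gamma' <: \Gamma$
and $\tau_1' <: \tau_1$ then $\Gamma' \vdash^{a}_{\blacktriangleright} \{\tau_1'\}\ s\ \{\tau_2'\}$ for some $\tau_2' <: \tau_2$. (2) If $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$
and $\Gamma' <: \Gamma$ and $\tau_1' <: \tau_1$ then $\Gamma' \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1'\}\ s\ \{\tau_2'\}$ for some $\tau_2' <: \tau_2$.

Theorem 4 (Algorithmic completeness for updates). (1) If $\Gamma \vdash^{a} \{\tau_1\}\ s\ \{\tau_2\}$ then
there exists $\tau_2' <: \tau_2$ such that $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2'\}$. (2) If $\Gamma \vdash_{\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$ then there
exists $\tau_2' <: \tau_2$ such that $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2'\}$.

5 Related and future work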



This work is directly motivated by our interest in using regular expression types for
XML updates, using richer typing rules for iteration as found in μXQ [4]. Fernández,
Siméon and Wadler [7] earlier considered an XML query language with more precise
typechecking for iteration, but this proposal required many more type annotations than
XQuery, μXQ or Flux do; we only require annotations on function or procedure
declarations.

For brevity, the core languages in this paper omitted many features of full XQuery,
such as the descendant, attribute, parent and sibling axes. The attribute axis is
straightforward since attributes always have text content. In μXQ, the descendant axis was
supported by assigning x/descendant-or-self the type formed by taking the union
of all tree types that are reachable from the type of x. XQuery handles other axes by
discarding type information. Our algorithmic completeness proof still appears to work
if these axes are added.

We are also interested in extending the path correctness analysis introduced by Co- 
lazzo et al. to Flux. In the update setting, a natural form of path correctness might be 
that there are no statically "dead" updates. 

Flux represents a fundamental departure from the other XML update language
proposals of which we are aware (such as XQuery! [10] and the draft W3C XQuery
Update Facility [2]). To the best of our knowledge, static typechecking and subtyping
have yet to be considered for such languages and seem likely to encounter difficulties for
reasons we outlined in Section l^Tl and discussed in more depth in [3]. In addition, Flux
satisfies many algebraic laws that can be used to rewrite updates without first needing
to perform static analysis, whereas a sophisticated analysis needs to be performed in
XQuery! even to determine whether two query expressions can be reordered. We believe
that this will enable aggressive update optimizations.

On the other hand, XQuery! and related proposals are clearly more expressive than
Flux, and have been incorporated into XML database systems such as Galax [6].
Although we currently have a prototype that implements the typechecking algorithm
described here as well as the operational semantics described in [3], further work is needed
to develop a robust implementation inside an XML database system that could be used
to compare the scalability and optimizability of Flux with other proposals.

6 Conclusions 

Static typechecking is important in a database setting because type (or "schema") in- 
formation is useful for optimizing queries and avoiding expensive run-time checks or 
re-validation. The XQuery standard, like other XML programming languages, employs 
regular expression types and subtyping. However, its approach to typechecking iteration 
constructs is imprecise, due to the use of "factoring" which discards information about 
the order of elements in the result of an iteration operation such as a for-loop. While
this imprecision may not be harmful for typical queries, it is disastrous for typechecking 
updates that are supposed to preserve the type of the database. 

In this paper we have considered more precise typing disciplines for XQuery-style
iterative queries and updates in the core languages μXQ and Flux respectively. In order
to ensure that these type systems are well-behaved and that typechecking is decidable, it
is important to prove the completeness of an algorithmic presentation of typechecking
in which the use of subtyping rules is limited so that typechecking remains
syntax-directed. We have shown how to do so for the core μXQ and Flux languages, and
believe the proof technique will extend to handle other features not included in the
paper. These results provide a solid foundation for subtyping in XML query and update
languages with precise iteration typechecking rules and for combining them with other
XML programming paradigms based on regular expression types.

A Proofs from Sections 3.3 and 4.3



A.1 Regular languages and homomorphisms

We assume familiarity with the theory of regular expressions and regular languages; in
this case, we consider types $\tau \in \mathit{Type}$ to be regular languages over atomic types $\alpha \in \mathit{Atom}$.
The language $L(\tau)$ denoted by a type is therefore a set of sequences $\omega \in \mathit{Atom}^*$
of atomic types, where $L : \mathit{Type} \to \mathcal{P}(\mathit{Atom}^*)$ is defined as follows:

\[
\begin{aligned}
L(()) &= \{()\} \\
L(\alpha) &= \{\alpha' \mid \alpha' <: \alpha\} \\
L(\tau, \tau') &= L(\tau) \cdot L(\tau') = \{\omega, \omega' \mid \omega \in L(\tau),\ \omega' \in L(\tau')\} \\
L(\tau \mid \tau') &= L(\tau) \cup L(\tau') \\
L(\tau^*) &= L(\tau)^* = \bigcup_{i=0}^{\infty} L(\tau)^i \\
L(X) &= L(E(X))
\end{aligned}
\]

Note that this definition differs slightly from the usual definition of the language of a
regular expression, in that we include all subtypes of atomic types $\alpha$ in $L(\alpha)$.
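To make the definition of $L$ concrete, the following Python sketch enumerates the words of $L(\tau)$ up to a length bound. It is an illustration under stated assumptions, not part of the paper's development: the class names (`Empty`, `Atom`, `Seq`, `Alt`, `Star`) and the `subatoms` parameter are hypothetical, type variables $X$ are omitted, and by default each atom is taken to have only itself as an atomic subtype.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # the type ()
    pass

@dataclass(frozen=True)
class Atom:             # an atomic type, identified by a name such as "b[]"
    name: str

@dataclass(frozen=True)
class Seq:              # tau, tau'
    left: object
    right: object

@dataclass(frozen=True)
class Alt:              # tau | tau'
    left: object
    right: object

@dataclass(frozen=True)
class Star:             # tau*
    body: object

def language(t, bound, subatoms=lambda a: {a}):
    """Words of L(t) of length <= bound, as tuples of atom names."""
    if isinstance(t, Empty):
        return {()}
    if isinstance(t, Atom):
        # L(alpha) contains every atomic subtype of alpha.
        return {(b,) for b in subatoms(t.name)} if bound >= 1 else set()
    if isinstance(t, Seq):
        left = language(t.left, bound, subatoms)
        right = language(t.right, bound, subatoms)
        return {u + v for u in left for v in right if len(u + v) <= bound}
    if isinstance(t, Alt):
        return language(t.left, bound, subatoms) | language(t.right, bound, subatoms)
    if isinstance(t, Star):
        # Iterate concatenation with non-empty words until the bound is hit.
        base = language(t.body, bound, subatoms) - {()}
        words, frontier = {()}, {()}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= bound} - words
            words |= frontier
        return words
    raise TypeError(f"not a type: {t!r}")

tau = Seq(Star(Atom("b[]")), Atom("c[]"))   # b[]*, c[]
words = language(tau, 3)
print(sorted(words, key=len))
```

With bound 3 this yields the three words `c[]`, `b[] c[]` and `b[] b[] c[]`. The bound is only there to keep the enumeration finite; $L(\tau)$ itself is infinite whenever $\tau$ contains a non-trivial star.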
It is straightforward to show the following useful properties of L: 

Lemma 6. $L(\tau) = \{\omega \mid \omega <: \tau\}$

Proof. For both directions, the proof is by induction on the structure of $\tau$. For the forward
direction, we have:

- Case $()$: immediate.

- Case $\alpha$: Suppose $\omega \in L(\alpha)$. Clearly $\omega = \alpha' <: \alpha$ for some atomic $\alpha'$.

- Case $\tau_1, \tau_2$: Suppose $\omega \in L(\tau_1, \tau_2)$. By definition, $\omega = \omega_1, \omega_2$ where $\omega_i \in L(\tau_i)$
for $i \in \{1, 2\}$. Then by induction $\omega_i <: \tau_i$ for $i \in \{1, 2\}$. Thus $\omega_1, \omega_2 <: \tau_1, \tau_2$.

- Case $\tau_1 \mid \tau_2$: Suppose $\omega \in L(\tau_1 \mid \tau_2)$. By definition, $\omega \in L(\tau_i)$ for
some $i \in \{1, 2\}$. Then by induction $\omega <: \tau_i$ for some $i \in \{1, 2\}$. Thus $\omega <: \tau_1 \mid \tau_2$.

- Case $\tau^*$: Suppose $\omega \in L(\tau^*)$. By definition, $\omega = \omega_1, \ldots, \omega_n$ where $n \geq 0$ and
$\omega_i \in L(\tau)$ for all $i \in \{1, \ldots, n\}$. Then by induction $\omega_i <: \tau$ for all $i \in \{1, \ldots, n\}$.
Thus $\omega = \omega_1, \ldots, \omega_n <: \tau, \ldots, \tau <: \tau^*$.

- Case $X$: Immediate by induction.

For the reverse direction, we have:

- Case $()$: immediate, since we must have $\omega = () \in L(())$.

- Case $\alpha$: Suppose $\omega <: \alpha$. Clearly $\omega = \alpha' <: \alpha$ for some atomic $\alpha'$, so $\omega \in L(\alpha)$.

- Case $\tau_1, \tau_2$: Suppose $\omega <: \tau_1, \tau_2$. Then since $\omega$ is a sequence of atoms we must have $\omega = \omega_1, \omega_2$
where $\omega_i <: \tau_i$ for $i \in \{1, 2\}$. Thus $\omega = \omega_1, \omega_2 \in L(\tau_1) \cdot L(\tau_2) = L(\tau_1, \tau_2)$.

- Case $\tau_1 \mid \tau_2$: Since $\omega$ is a sequence of atoms, $\omega <: \tau_1 \mid \tau_2$ implies that $\omega <: \tau_1$ or $\omega <: \tau_2$. Thus
$\omega \in L(\tau_1) \cup L(\tau_2) = L(\tau_1 \mid \tau_2)$.

- Case $\tau^*$: Since $\omega$ is a sequence of atoms, we must have $\omega = \omega_1, \ldots, \omega_n$ where $\omega_i <: \tau$; hence
$\omega = \omega_1, \ldots, \omega_n \in L(\tau)^* = L(\tau^*)$.

- Case $X$: Immediate by induction.

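Lemma 7. If $v \in \llbracket \tau \rrbracket$, then there exists an $\omega \in L(\tau)$ such that $v \in \llbracket \omega \rrbracket$.

Proof. Induction on the structure of $v, \tau$.

- Case $(), ()$: Immediate; $\omega = ()$ works.

- Case $v, \alpha$: Immediate; $\omega = \alpha$ works.

- Case $v, (\tau_1, \tau_2)$: We must have $v = v_1, v_2$ where $v_i \in \llbracket \tau_i \rrbracket$ for $i \in \{1, 2\}$. Then by
induction we have $\omega_i \in L(\tau_i)$ with $v_i \in \llbracket \omega_i \rrbracket$; this implies $v \in \llbracket \omega_1, \omega_2 \rrbracket \subseteq \llbracket \tau_1, \tau_2 \rrbracket$.

- Case $v, \tau_1 \mid \tau_2$: Without loss of generality, suppose $v \in \llbracket \tau_1 \rrbracket$. Then by induction we
have $\omega \in L(\tau_1) \subseteq L(\tau_1 \mid \tau_2)$ such that $v \in \llbracket \omega \rrbracket \subseteq \llbracket \tau_1 \mid \tau_2 \rrbracket$.

- Case $v, \tau^*$: If $v = ()$, then $\omega = ()$ works. Otherwise we must have $v = v_1, \ldots, v_n$
where $v_i \in \llbracket \tau \rrbracket$. Then by induction we have $\omega_i \in L(\tau)$ with $v_i \in \llbracket \omega_i \rrbracket$; this implies
that $\omega_1, \ldots, \omega_n \in L(\tau^*)$ and $v \in \llbracket \omega_1, \ldots, \omega_n \rrbracket \subseteq \llbracket \tau^* \rrbracket$.

- Case $X$: Immediate by induction.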

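Lemma 8. For any $\tau, \tau'$, $\tau <: \tau'$ if and only if $L(\tau) \subseteq L(\tau')$.

Proof. In the forward direction, if $\tau <: \tau'$, then let $\omega \in L(\tau)$ be given. Then by Lemma 6,
$\omega <: \tau <: \tau'$. Thus, $\omega \in L(\tau')$.

In the reverse direction, suppose that $L(\tau) \subseteq L(\tau')$. Suppose $v \in \llbracket \tau \rrbracket$. Via Lemma 7
choose $\omega$ such that $v \in \llbracket \omega \rrbracket$ and $\omega \in L(\tau)$. Since $L(\tau) \subseteq L(\tau')$, we have that $\omega <: \tau'$,
so $v \in \llbracket \omega \rrbracket \subseteq \llbracket \tau' \rrbracket$. We conclude that $\llbracket \tau \rrbracket \subseteq \llbracket \tau' \rrbracket$, so by definition $\tau <: \tau'$.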
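As a small worked instance of Lemma 8 (an illustration only, identifying equivalent atoms so that $L(b[]) = \{b[]\}$ and $L(c[]) = \{c[]\}$), consider the types from the update example in Section 4.2:

\[
L(b[]^*, c[]) = \{\, \underbrace{b[], \ldots, b[]}_{n}, c[] \;\mid\; n \geq 0 \,\},
\qquad
L(b[], c[]) = \{\, b[], c[] \,\}.
\]

The second language is the $n = 1$ slice of the first, so $L(b[], c[]) \subseteq L(b[]^*, c[])$ and hence $b[], c[] <: b[]^*, c[]$. Conversely, the word $c[]$ (the case $n = 0$) lies in $L(b[]^*, c[])$ but not in $L(b[], c[])$, so $b[]^*, c[] \not<: b[], c[]$.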

We now recall properties of homomorphisms of regular type expressions. A (partial)
homomorphism $h : \mathit{Type} \to \mathit{Type}$ (or $h : \mathit{Type} \rightharpoonup \mathit{Type}$) is a (partial) function
satisfying

\[
\begin{aligned}
h(()) &= () \\
h(\tau, \tau') &= h(\tau), h(\tau') \\
h(\tau \mid \tau') &= h(\tau) \mid h(\tau') \\
h(\tau^*) &= h(\tau)^* \\
h(X) &= h(E(X))
\end{aligned}
\]

In particular, we consider (partial) homomorphisms that are generated entirely by their
behavior on atoms, that is, given a (partial) function $k : \mathit{Atom} \rightharpoonup \mathit{Type}$, we construct
the unique (partial) homomorphism $\hat{k}$ agreeing with $k$ by taking $\hat{k}(\alpha) = k(\alpha)$ (when
defined) and using the above equations in all other cases.
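The construction of $\hat{k}$ from $k$ can be sketched directly. The Python below is a minimal illustration under assumptions made here, not the paper's implementation: the class names and `extend` are hypothetical, type variables $X$ are omitted, and "undefined" is modelled as `None`. The sample `k` records how the statement `b?right insert c[]` from Figure 4 acts on the two atoms $b[]$ and $c[]$.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # the type ()
    pass

@dataclass(frozen=True)
class Atom:             # an atomic type, identified by a name such as "b[]"
    name: str

@dataclass(frozen=True)
class Seq:              # tau, tau'
    left: object
    right: object

@dataclass(frozen=True)
class Alt:              # tau | tau'
    left: object
    right: object

@dataclass(frozen=True)
class Star:             # tau*
    body: object

def extend(k):
    """Return the partial homomorphism generated by the partial map k on atoms.

    k is a dict from atom names to types; the result returns None where the
    extension is undefined.
    """
    def h(t):
        if isinstance(t, Empty):
            return Empty()                      # h(()) = ()
        if isinstance(t, Atom):
            return k.get(t.name)                # h(alpha) = k(alpha), if defined
        if isinstance(t, (Seq, Alt)):
            left, right = h(t.left), h(t.right)
            if left is None or right is None:
                return None
            return type(t)(left, right)         # h distributes over , and |
        if isinstance(t, Star):
            body = h(t.body)
            return None if body is None else Star(body)   # h(tau*) = h(tau)*
        raise TypeError(f"not a type: {t!r}")
    return h

b, c = Atom("b[]"), Atom("c[]")
k = {"b[]": Seq(b, c), "c[]": c}        # {b[]} b?s' {b[], c[]} and {c[]} b?s' {c[]}
h = extend(k)

output = h(Seq(Star(b), c))             # input type b[]*, c[]
print(output)
```

Applied to $b[]^*, c[]$ the extension produces $(b[], c[])^*, c[]$, the output type assigned to `iter[b?s']` in Figure 4. This is the sense in which the iteration judgment is later shown to be the homomorphic extension of the singular one.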

We say that a (partial) function $F : X \rightharpoonup Y$ on ordered sets $X, Y$ is downward
closed if whenever $x' \leq_X x$, and $F(x)$ exists, then $F(x')$ also exists; a downward
closed function is downward monotonic if in addition $F(x') \leq_Y F(x)$.

In the following, we use the notation $F[-] : \mathcal{P}(X) \rightharpoonup \mathcal{P}(Y)$ for the partial function
on sets obtained by lifting $F : X \rightharpoonup Y$; $F[S]$ is defined and equals $\{F(s) \mid s \in S\}$
provided $F$ is defined on each element of $S$. It is easy to show that this operation is
downward monotonic with respect to set inclusion and preserves totality (if $F$ is total
then $F[-]$ is total also).

We need a second auxiliary function, namely the set of atoms appearing in a type.
This is given by $A : \mathit{Type} \to \mathcal{P}(\mathit{Atom})$, defined as follows:

\[
\begin{aligned}
A(()) &= \{\} \\
A(\alpha) &= \{\alpha' \mid \alpha' <: \alpha\} \\
A(\tau, \tau') &= A(\tau) \cup A(\tau') \\
A(\tau \mid \tau') &= A(\tau) \cup A(\tau') \\
A(\tau^*) &= A(\tau) \\
A(X) &= A(E(X))
\end{aligned}
\]

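The following fact about $A$ will be needed:

Lemma 9. If $\tau <: \tau'$ then $A(\tau) \subseteq A(\tau')$.

Proof. Note that $A(\tau) = \bigcup B[L(\tau)]$ where $B : \mathit{Atom}^* \to \mathcal{P}(\mathit{Atom})$ is defined by

\[
\begin{aligned}
B(()) &= \{\} \\
B(\alpha\,\omega) &= \{\alpha' \mid \alpha' <: \alpha\} \cup B(\omega)
\end{aligned}
\]

and $\bigcup : \mathcal{P}(\mathcal{P}(\mathit{Atom})) \to \mathcal{P}(\mathit{Atom})$ is the usual flattening operator on sets. All three
functions $\bigcup$, $B[-]$, $L$ are monotonic, where monotonicity of $L$ with respect to subtyping is Lemma 8.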

Lemma 10. Let $h : \mathit{Atom} \rightharpoonup \mathit{Type}$ be given. If $h(\alpha)$ is defined for each $\alpha \in A(\tau)$
then $\hat{h}(\tau)$ is defined.

Proof. By structural induction on $\tau$. The base case $\tau = \alpha$ is by definition of $\hat{h}(\alpha) = h(\alpha)$.
The remaining cases are straightforward because $\hat{h}$ is a homomorphism.

Lemma 11. If $h : \mathit{Atom} \rightharpoonup \mathit{Type}$ is downward closed, and $\hat{h}(\tau)$ is defined, then $h(\alpha)$
is defined for every $\alpha \in A(\tau)$.

Proof. By structural induction on $\tau$. For the base case $\tau = \alpha$, we need downward
closedness to conclude that $h(\alpha')$ is defined for each $\alpha' <: \alpha$. The remaining cases are
straightforward because $\hat{h}$ is a homomorphism.

Lemma 12. If $h : \mathit{Atom} \rightharpoonup \mathit{Type}$ is downward closed, then $\hat{h}$ is downward closed.

Proof. Let $\tau' <: \tau$ be given such that $\hat{h}(\tau)$ is defined. Then by Lemma 11, $h(\alpha)$ is
defined on every $\alpha \in A(\tau)$. But $A(\tau') \subseteq A(\tau)$ (Lemma 9), so by Lemma 10, $\hat{h}(\tau')$ is
defined.
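Lemmas 10 and 11 say that definedness of $\hat{h}(\tau)$ is governed entirely by the atoms in $A(\tau)$. The Python sketch below checks this on a small case. It is an illustration under simplifying assumptions made here: the class names, `atoms` and `extend` are hypothetical, type variables are omitted, and each atom is assumed to have only itself as an atomic subtype (so every partial map on atoms is trivially downward closed).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # the type ()
    pass

@dataclass(frozen=True)
class Atom:             # an atomic type, identified by a name such as "b[]"
    name: str

@dataclass(frozen=True)
class Seq:              # tau, tau'
    left: object
    right: object

@dataclass(frozen=True)
class Alt:              # tau | tau'
    left: object
    right: object

@dataclass(frozen=True)
class Star:             # tau*
    body: object

def atoms(t):
    """A(t): names of the atoms occurring in t (flat atom subtyping assumed)."""
    if isinstance(t, Empty):
        return set()
    if isinstance(t, Atom):
        return {t.name}
    if isinstance(t, (Seq, Alt)):
        return atoms(t.left) | atoms(t.right)
    if isinstance(t, Star):
        return atoms(t.body)
    raise TypeError(f"not a type: {t!r}")

def extend(k):
    """Partial homomorphic extension of the dict k; None means undefined."""
    def h(t):
        if isinstance(t, Empty):
            return Empty()
        if isinstance(t, Atom):
            return k.get(t.name)
        if isinstance(t, (Seq, Alt)):
            left, right = h(t.left), h(t.right)
            return None if left is None or right is None else type(t)(left, right)
        if isinstance(t, Star):
            body = h(t.body)
            return None if body is None else Star(body)
        raise TypeError(f"not a type: {t!r}")
    return h

b, c = Atom("b[]"), Atom("c[]")
k = {"b[]": Seq(b, c)}                  # defined on b[] only
h = extend(k)

t1 = Star(b)                            # b[]*       : all atoms are in dom(k)
t2 = Seq(Star(b), c)                    # b[]*, c[]  : c[] is not in dom(k)

for t in (t1, t2):
    covered = atoms(t) <= set(k)
    defined = h(t) is not None
    print(t, covered, defined)
```

In both cases "all atoms covered" and "extension defined" agree: $\hat{h}$ is defined on $b[]^*$ and undefined on $b[]^*, c[]$.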

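Lemma 13. Suppose $h : \mathit{Atom} \rightharpoonup \mathit{Type}$ is downward monotonic. Then for any $\tau \in \mathrm{dom}(\hat{h})$,

\[
\bigcup L[\hat{h}[L(\tau)]] = L(\hat{h}(\tau)) \tag{1}
\]

Proof. By induction on the structure of $\tau$.

- $\tau = ()$. Then

\[
\begin{aligned}
\bigcup L[\hat{h}[L(())]] &= \bigcup L[\hat{h}[\{()\}]] = \bigcup L[\{\hat{h}(())\}] = \bigcup L[\{()\}] \\
&= \bigcup \{L(())\} = L(()) = L(\hat{h}(()))
\end{aligned}
\]

- $\tau = \alpha$. We need to show that $\bigcup L[\hat{h}[L(\alpha)]] = L(\hat{h}(\alpha))$.

\[
\begin{aligned}
\bigcup L[\hat{h}[L(\alpha)]] &= \bigcup L[\hat{h}[\{\alpha' \mid \alpha' <: \alpha\}]] \\
&= \bigcup L[\{h(\alpha') \mid \alpha' <: \alpha\}] \\
&= \bigcup \{L(h(\alpha')) \mid \alpha' <: \alpha\}
\end{aligned}
\]

Now since $h$ is downward monotonic and defined on $\alpha$, for each $\alpha' <: \alpha$ we have
that $h(\alpha') <: h(\alpha)$. Thus, $L(h(\alpha')) \subseteq L(h(\alpha))$, and since the union includes the term for $\alpha' = \alpha$,
$\bigcup \{L(h(\alpha')) \mid \alpha' <: \alpha\} = L(h(\alpha))$, as desired.

- $\tau = \tau_1, \tau_2$. Then

\[
\begin{aligned}
\bigcup L[\hat{h}[L(\tau_1, \tau_2)]] &= \bigcup L[\hat{h}[L(\tau_1) \cdot L(\tau_2)]] = \bigcup L[\hat{h}[L(\tau_1)] \cdot \hat{h}[L(\tau_2)]] \\
&= \bigcup \bigl(L[\hat{h}[L(\tau_1)]] \cdot L[\hat{h}[L(\tau_2)]]\bigr) = \Bigl(\bigcup L[\hat{h}[L(\tau_1)]]\Bigr) \cdot \Bigl(\bigcup L[\hat{h}[L(\tau_2)]]\Bigr) \\
&= L(\hat{h}(\tau_1)) \cdot L(\hat{h}(\tau_2)) = L(\hat{h}(\tau_1), \hat{h}(\tau_2)) = L(\hat{h}(\tau_1, \tau_2))
\end{aligned}
\]

- $\tau = \tau_1 \mid \tau_2$. Then

\[
\begin{aligned}
\bigcup L[\hat{h}[L(\tau_1 \mid \tau_2)]] &= \bigcup L[\hat{h}[L(\tau_1) \cup L(\tau_2)]] = \bigcup L[\hat{h}[L(\tau_1)] \cup \hat{h}[L(\tau_2)]] \\
&= \bigcup \bigl(L[\hat{h}[L(\tau_1)]] \cup L[\hat{h}[L(\tau_2)]]\bigr) = \Bigl(\bigcup L[\hat{h}[L(\tau_1)]]\Bigr) \cup \Bigl(\bigcup L[\hat{h}[L(\tau_2)]]\Bigr) \\
&= L(\hat{h}(\tau_1)) \cup L(\hat{h}(\tau_2)) = L(\hat{h}(\tau_1) \mid \hat{h}(\tau_2)) = L(\hat{h}(\tau_1 \mid \tau_2))
\end{aligned}
\]

- $\tau = \tau_1^*$. Then

\[
\begin{aligned}
\bigcup L[\hat{h}[L(\tau_1^*)]] &= \bigcup L[\hat{h}[L(\tau_1)^*]] = \bigcup L[\hat{h}[L(\tau_1)]^*] \\
&= \bigcup L[\hat{h}[L(\tau_1)]]^* = \Bigl(\bigcup L[\hat{h}[L(\tau_1)]]\Bigr)^* \\
&= L(\hat{h}(\tau_1))^* = L(\hat{h}(\tau_1)^*) = L(\hat{h}(\tau_1^*))
\end{aligned}
\]

- $\tau = X$: Immediate by induction.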

Theorem 5. If $h : \mathit{Atom} \rightharpoonup \mathit{Type}$ is downward monotonic, then $\hat{h}$ is downward
monotonic.

Proof. Let $\tau' <: \tau$ be given such that $\hat{h}(\tau)$ is defined. By Lemma 12, $\hat{h}(\tau')$ is defined.
We must show that $\hat{h}(\tau') <: \hat{h}(\tau)$. Since $\tau' <: \tau$, by Lemma 8 we have $L(\tau') \subseteq L(\tau)$.
It follows from the monotonicity of $\bigcup$, $L[-]$ and $\hat{h}[-]$ that $\bigcup L[\hat{h}[L(\tau')]] \subseteq
\bigcup L[\hat{h}[L(\tau)]]$. By Lemma 13 we have that $L(\hat{h}(\tau')) \subseteq L(\hat{h}(\tau))$, but by Lemma 8 this
implies that $\hat{h}(\tau') <: \hat{h}(\tau)$.

A.2 Proving algorithmic completeness

The two key properties which ensure that occurrences of the subsumption rule can be
eliminated from derivations are uniqueness of algorithmic types and downward
monotonicity of the algorithmic judgments.

Uniqueness, discussed already in proving decidability of the algorithmic judgments
(Lemma 1 and Lemma 4), simply means that if the "inputs" to a judgment are fixed,
then there is at most one "output" type derivable by algorithmic judgments; thus, the
judgments define partial functions. Recall that for fixed $x, e, n, a, s$, we defined:

1. $f_n(\tau_1)$ as the unique $\tau_2$ such that $\tau_1 :: n \Rightarrow \tau_2$.

2. $g_e(\Gamma)$ as the unique $\tau$ such that $\Gamma \vdash_{\blacktriangleright} e : \tau$ (if it exists).

3. $h_{x,e}(\Gamma, \tau_1)$ as the unique $\tau_2$ such that $\Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1 \to e : \tau_2$ (if it exists).

4. $j_{a,s}(\Gamma, \tau_1)$ as the unique $\tau_2$ such that $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2\}$ (if it exists).

5. $k_s(\Gamma, \tau_1)$ as the unique $\tau_2$ such that $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$ (if it exists).

Downward monotonicity of the type judgments corresponds precisely to downward
monotonicity of the above functions (where we use the subtyping order on context
arguments $\Gamma$ defined in Section lTTI). To prove downward monotonicity of the type-directed
$f$, $h$, $k$, we need to make use of the characterization of downward monotonicity for
partial homomorphic extensions established in the last section.

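Proposition 1 (Downward Monotonicity).

1. For every $n$, the function $f_n$ is downward monotonic.

2. For every $e$ and $x$, the functions $g_e$ and $h_{x,e}$ are downward monotonic, and $h_{x,e}(\Gamma, -)$
is the partial homomorphic extension of $g_e(\Gamma, x{:}(-))$.

3. For every $s$ and $a$, the functions $j_{a,s}$ and $k_s$ are downward monotonic, and $k_s(\Gamma, -)$
is the partial homomorphic extension of $j_{1,s}(\Gamma, -)$.

Proof. For part (1), we just need to show that $f_n$ is generated by the function

\[
\alpha \mapsto
\begin{cases}
n[\tau] & \alpha = n[\tau] \\
() & \text{otherwise}
\end{cases}
\]

which is obviously downward monotonic.

For part (2), proof is by induction on the structure of $e$. For each $e$, we first show
downward monotonicity of $g_e$ by inspecting derivations. We show a few representative
examples:

- Case (var): If the derivation is of the form

\[
\frac{x{:}\tau \in \Gamma}{\Gamma \vdash_{\blacktriangleright} x : \tau}
\]

then we have $x{:}\tau' \in \Gamma'$ where $\tau' <: \tau$, hence may derive:

\[
\frac{x{:}\tau' \in \Gamma'}{\Gamma' \vdash_{\blacktriangleright} x : \tau'}
\]

- Case (let): If the derivation is of the form

\[
\frac{\Gamma \vdash_{\blacktriangleright} e_1 : \tau_1 \qquad \Gamma, x{:}\tau_1 \vdash_{\blacktriangleright} e_2 : \tau_2}
     {\Gamma \vdash_{\blacktriangleright} \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2}
\]

then by induction we have $\Gamma' \vdash_{\blacktriangleright} e_1 : \tau_1'$ for some $\tau_1' <: \tau_1$ and since $\Gamma' <: \Gamma$,
we have $\Gamma', x{:}\tau_1' <: \Gamma, x{:}\tau_1$, so also by induction $\Gamma', x{:}\tau_1' \vdash_{\blacktriangleright} e_2 : \tau_2'$ for some
$\tau_2' <: \tau_2$. To conclude, we derive

\[
\frac{\Gamma' \vdash_{\blacktriangleright} e_1 : \tau_1' \qquad \Gamma', x{:}\tau_1' \vdash_{\blacktriangleright} e_2 : \tau_2'}
     {\Gamma' \vdash_{\blacktriangleright} \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2'}
\]

- Case (for): If the derivation is of the form

\[
\frac{\Gamma \vdash_{\blacktriangleright} e_1 : \tau_1 \qquad \Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1 \to e_2 : \tau_2}
     {\Gamma \vdash_{\blacktriangleright} \mathsf{for}\ x \in e_1\ \mathsf{return}\ e_2 : \tau_2}
\]

then by induction we have $\Gamma' \vdash_{\blacktriangleright} e_1 : \tau_1'$ for some $\tau_1' <: \tau_1$. Using the downward
monotonicity of $h_{x,e_2}$, we can obtain $\tau_2' <: \tau_2$ such that $\Gamma' \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1' \to e_2 : \tau_2'$.
To conclude, we derive

\[
\frac{\Gamma' \vdash_{\blacktriangleright} e_1 : \tau_1' \qquad \Gamma' \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1' \to e_2 : \tau_2'}
     {\Gamma' \vdash_{\blacktriangleright} \mathsf{for}\ x \in e_1\ \mathsf{return}\ e_2 : \tau_2'}
\]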

Showing that $h_{x,e}$ is downward monotonic is immediate once we show that $h_{x,e}(\Gamma, -)$
is the partial homomorphic extension of $g_e(\Gamma, x{:}(-))$ for any $\Gamma$. The latter property can
be proved by induction on the structure of derivations of $\Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1 \to e : \tau_2$. The
cases involving regular expression constructs or variables are straightforward, and the
base case

\[
\frac{\Gamma, x{:}\alpha \vdash_{\blacktriangleright} e : \tau}{\Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \alpha \to e : \tau}
\]

is also straightforward since $h_{x,e}(\Gamma, \alpha) = g_e(\Gamma, x{:}\alpha)$ by definition.

Similarly, for part (3), $j$ and $k$, the proof is by induction on derivations. The cases
involving $j$ are straightforward; the case for $\mathsf{iter}$ is similar to that for $\mathsf{for}$ above.
To show $k_s(\Gamma, -)$ is the partial homomorphic extension of $j_{1,s}(\Gamma, -)$ and hence that $k_s$
is downward monotonic, the proof is by simultaneous induction on derivations, just as
for $g$ and $h$ above.

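By rewriting the above proposition in terms of judgments, we can conclude:

Theorem 6 (Downward monotonicity).

1. If $\tau_1 :: n \Rightarrow \tau_2$ and $\tau_1' <: \tau_1$ then $\tau_1' :: n \Rightarrow \tau_2'$ for some $\tau_2' <: \tau_2$.

2. If $\Gamma \vdash_{\blacktriangleright} e : \tau$ and $\Gamma' <: \Gamma$ then $\Gamma' \vdash_{\blacktriangleright} e : \tau'$ for some $\tau' <: \tau$.

3. If $\Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1 \to e : \tau_2$ and $\Gamma' <: \Gamma$ and $\tau_1' <: \tau_1$ then $\Gamma' \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1' \to e : \tau_2'$
for some $\tau_2' <: \tau_2$.

4. If $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2\}$ and $\Gamma' <: \Gamma$ and $\tau_1' <: \tau_1$ then $\Gamma' \vdash^{a}_{\blacktriangleright} \{\tau_1'\}\ s\ \{\tau_2'\}$ for some
$\tau_2' <: \tau_2$.

5. If $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$ and $\Gamma' <: \Gamma$ and $\tau_1' <: \tau_1$ then $\Gamma' \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1'\}\ s\ \{\tau_2'\}$ for
some $\tau_2' <: \tau_2$.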

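Finally, taking $\Gamma = \Gamma'$ and $\tau_1 = \tau_1'$ in parts 2-5 above, we conclude:

Theorem 7 (Algorithmic completeness).

1. If $\Gamma \vdash e : \tau$ then there exists $\tau' <: \tau$ such that $\Gamma \vdash_{\blacktriangleright} e : \tau'$.

2. If $\Gamma \vdash x\ \mathsf{in}\ \tau_1 \to e : \tau_2$ then there exists $\tau_2' <: \tau_2$ such that $\Gamma \vdash_{\blacktriangleright} x\ \mathsf{in}\ \tau_1 \to e : \tau_2'$.

3. If $\Gamma \vdash^{a} \{\tau_1\}\ s\ \{\tau_2\}$ then there exists $\tau_2' <: \tau_2$ such that $\Gamma \vdash^{a}_{\blacktriangleright} \{\tau_1\}\ s\ \{\tau_2'\}$.

4. If $\Gamma \vdash_{\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2\}$ then there exists $\tau_2' <: \tau_2$ such that $\Gamma \vdash_{\blacktriangleright\,\mathrm{iter}} \{\tau_1\}\ s\ \{\tau_2'\}$.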